Brown Dwarf: A Distributed Data Warehouse for the Cloud

Authors

  • Katerina Doka
  • Dimitrios Tsoumakos
  • Nectarios Koziris
Abstract

In this paper we present Brown Dwarf, a distributed system designed to efficiently store, query and update multidimensional data over commodity network nodes, without the use of any proprietary tool. Brown Dwarf distributes a highly effective centralized structure among peers on the fly, reducing cube creation and query times through parallelization. Point queries, aggregate queries and updates are all performed online through cooperating nodes that hold parts of a fully or partially materialized data cube. The system also employs an adaptive replication scheme that expands or shrinks the units of the distributed data structure, keeping storage consumption minimal while guarding against failures and load skew. Brown Dwarf combines many of the features expected of an application deployed in the Cloud: it adapts its resources according to demand, allows for online, fast and efficient storage and processing of large amounts of data, and is cost-effective in both the hardware and the software components it requires. Our system has been evaluated on both actual and simulation-based testbeds. To summarize the findings of our extensive experiments, Brown Dwarf accelerates cube creation by up to 5 times and querying by up to several tens of times by exploiting the available network nodes working in parallel. Incurring only a small storage overhead compared to the centralized algorithm, it distributes the structure fairly evenly across the overlay nodes. It adapts quickly even after sudden bursts in load and remains unaffected even when a considerable fraction of nodes fail frequently. These advantages are even more apparent for dense and skewed data cubes and workloads.
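As a rough illustration of the point and aggregate queries mentioned in the abstract, the following Python sketch fully materializes a tiny data cube on a single machine. It does not reproduce the Brown Dwarf structure, its distribution across nodes, or its replication scheme; the names `build_cube`, `query` and `ALL`, as well as the sample facts, are hypothetical and chosen only for illustration.

```python
from itertools import product

ALL = "*"  # hypothetical wildcard: "aggregate over this dimension"


def build_cube(facts):
    """Fully materialize a cube as a dict.

    Each fact is (dim_1, ..., dim_d, measure). Every fact contributes its
    measure to 2^d cells: each dimension value is either kept or replaced
    by ALL. Measures are aggregated with SUM.
    """
    cube = {}
    for *dims, measure in facts:
        for mask in product((False, True), repeat=len(dims)):
            key = tuple(ALL if hidden else value
                        for value, hidden in zip(dims, mask))
            cube[key] = cube.get(key, 0) + measure
    return cube


def query(cube, *key):
    """Look up a cell; a key without ALL is a point query, a key with
    one or more ALL entries is an aggregate query."""
    return cube.get(tuple(key), 0)


# Illustrative facts: (store, customer, product, sales)
facts = [
    ("S1", "C1", "P1", 10),
    ("S1", "C2", "P1", 20),
    ("S2", "C1", "P2", 5),
]
cube = build_cube(facts)

point = query(cube, "S1", "C2", "P1")   # single fact cell
by_store = query(cube, "S1", ALL, ALL)  # all sales of store S1
total = query(cube, ALL, ALL, ALL)      # grand total
```

Because every aggregate is precomputed, each query is a single dictionary lookup. The price is storage that grows exponentially with the number of dimensions, which is the kind of cost that compressed or partially materialized cube structures aim to reduce.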


Related resources

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations because the rapid development in the use of information technology in general and network technology in particular, has led to the trend of many organizations to make their applications available for use via electronic platforms hos...


Cloud formation in substellar atmospheres

Clouds seem like an everyday experience. But do we know how clouds form on brown dwarfs and extra-solar planets? What do they look like? Can we see them? What are they composed of? Cloud formation is an old-fashioned but still outstanding problem for the Earth atmosphere, and it has turned into a challenge for the modelling of brown dwarf and exo-planetary atmospheres. Cloud formation imposes...


Low mass T Tauri and young brown dwarf candidates in the Chamaeleon II dark cloud found by DENIS

We define a sample designed to select low-mass T Tauri stars and young brown dwarfs using DENIS data in the Chamaeleon II molecular cloud. We use a star count method to construct an extinction map of the Chamaeleon II cloud. We select our low-mass T Tauri star and young brown dwarf candidates by their strong infrared color excess in the I − J/J − Ks color-color dereddened diagram. We retain onl...


Spectroscopy of Brown Dwarf Candidates in the NGC 1333 Molecular Cloud

We present an analysis of low-resolution infrared spectra for 25 brown dwarf candidates in the NGC 1333 molecular cloud. Candidates were chosen on the basis of their association with the high column density cloud core, and near-infrared fluxes and colors. We compare the depths of water vapor absorption bands in our candidate objects with a grid of dwarf, subgiant, and giant standards to determi...


Data Replication-Based Scheduling in Cloud Computing Environment

High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...



Publication date: 2009